NSX-T Cheatsheet and Troubleshooting
- Authors
- Name
- Jackson Chen
VMware NSX-T Data Center Troubleshooting
Enter privilege mode and command mode
st en # NSX manager privilege mode
esxcli # ESXi host to enter esxcli command mode
sudo -i # KVM privilege mode
NSX-T Troubleshooting
https://www.simongreaves.co.uk/nsx-t/
Check L2 before L3
# Retrieve all available commands to query and configure
list
# API query and configure (Postman, Insomnia)
http GET # Query
http PUT # Create or replace
http PATCH # Update
http POST # Create, or invoke an action
http DELETE # Delete
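The verbs above can also be driven from a script instead of Postman or Insomnia. The following is a minimal sketch that only builds a request object and sends nothing; the manager hostname and credentials are made-up placeholders, and `build_nsx_request` is a hypothetical helper, not part of any NSX SDK.

```python
import base64
import urllib.request


def build_nsx_request(manager, path, user, password, method="GET", body=None):
    """Build (but do not send) an HTTPS request for the NSX Manager REST API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    if body is not None:
        # body must already be JSON-encoded bytes
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"https://{manager}{path}", data=body, headers=headers, method=method
    )


# Hypothetical manager address and credentials; nothing is sent on the network.
req = build_nsx_request("nsx-mgr.example.com", "/api/v1/cluster/status", "admin", "secret")
print(req.get_method(), req.full_url)
```

Passing the object to `urllib.request.urlopen` would actually send it; in a lab with self-signed certificates that call would also need an explicit SSL context.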
# PowerCLI
Connect-VIServer -Server <vCenterServer> # Connect to vCenter
Check (L2)
1. MTU
2. VLAN
3. TEP
IP
MTU
4. CCP
N-VDS settings (L3)
1. MTU (L2)
2. Routing table (L4)
3. TEP
4. vTEP tables
5. MAC tables
Manager Troubleshooting
1. CorfuDB3
2. nodes
3. Quorum must be up, at least 2 corfu servers required for quorum
4. Group Member Leader Election Server (GMLE) helps in detecting the fault with an NSX Manager node failure. It also helps elect a new leader per group.
5. Day 2 Operations
Use st en to enter engineering mode (root privileged mode)
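The quorum rule in item 3 is ordinary majority arithmetic. A small sketch, assuming a strict-majority quorum (an assumption that matches the "at least 2" figure for a three-node cluster):

```python
def quorum_size(cluster_nodes):
    """Smallest number of nodes that forms a strict majority."""
    if cluster_nodes < 1:
        raise ValueError("cluster must have at least one node")
    return cluster_nodes // 2 + 1


def has_quorum(cluster_nodes, healthy_nodes):
    """True when enough nodes are healthy to keep a majority."""
    return healthy_nodes >= quorum_size(cluster_nodes)


print(quorum_size(3))    # nodes needed in a three-node cluster
print(has_quorum(3, 1))  # a single surviving node is not a majority
```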
Logs
## NSX Manager
Component Log Files and Locations
NSX Policy Manager /var/log/policy/policy.log
NSX Manager /var/log/syslog
/var/log/proton/nsxapi.log
/var/log/nsx-audit.log
manager.log
get log-file manager # can only be viewed from the CLI
NSXAPI Logs /var/log/proton/nsxapi.log
CorfuDB logs /var/log/corfu # directory contains CorfuDB logs
Cluster BootstrapManager (CBM) /var/log/cbm # directory contains cbm logs
General Logs /var/log/syslog # syslog messages file
NSX Controller /var/log/cloudnet/nsx-ccp.log
Audit Logs /var/log/nsx-audit.log
## ESXi host
ESXi host /var/log/nsx-syslog
/var/log/esxupdate.log
/var/log/nsxa-opsagent.log
/var/log/syslog
/var/log/vmkernel.log
/var/log/nsx-proxy.log
## KVM host
KVM host /var/log/vmware/nsx-syslog # syslog message file
/var/log/syslog
/var/log/openvswitch/ovs-vswitchd.log
/var/log/dpkg.log
DFW /var/log/dfwpktlogs.log (only fills if logging enabled on rule)
## Edge nodes
Edge Nodes Syslog
(get log-file syslog)
/var/log/syslog
Load Balancer errors Access-log [follow] # Using follow to show log as it is being updated
Error-log [follow]
get load-balancer <lb-uuid> error-log # Example to view error log
## NSX Cli command
get log-file <filename>
get log-file <filename> follow
# Below are commonly used log files, there are many more log files
get log-file <auth.log | controller | controller-error | http.log | kern.log | manager.log | node-mgmt.log | policy.log | syslog> [follow]
# use [follow] to monitor the log continuously
Example: get log-file syslog follow
## Distributed firewall logs
# ESXi host /var/log/nsx-syslog.log
/var/log/dfwpktlogs.log
# KVM host /var/log/vmware/nsx-syslog
/var/log/dfwpktlogs.log
Set logging level on NSX Manager with
set logging-server <remote-syslog-server>:514 proto [udp|tcp] level [info|debug]
set service manager logging-level debug
Other Logging Information
Log Message IDs
Infrastructure Preparation Logs
Policy Manager logs
View with get log-file policy.log
get log-file syslog
Controller Log
CFG Agent Log
(ESXi)
KVM
Syslog
Configure Syslog Exporter
Using vRLI with NSX (vRealize Log Insight)
1. https://<vRLI>/admin
2. Install log insight content pack for NSX-T data center
get logging-server # Verify logging configuration
set logging-server <syslog-hostname or ip>:port proto [protocol] level <level>
# ESXi host
esxcli system syslog config set --loghost=udp://<syslog-ip>:<port>
esxcli system syslog reload
# KVM
Edit /etc/rsyslog.d/<vmware log>.conf
*.* @<syslog-ip>:514
service rsyslog restart
Protocols Supported
TCP
UDP
TLS
Severity Level
1. Emergency
2. Alert
3. Critical
4. Error
5. Warning
6. Notice
7. Informational
8. Debug
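The list above is numbered 1 to 8, whereas syslog itself (RFC 5424) encodes the same severities as 0 to 7, with lower numbers being more severe. A minimal sketch of that mapping; the helper names are illustrative, and the full severity names used here are not necessarily the keywords the NSX CLI accepts for `level`.

```python
SYSLOG_SEVERITIES = (
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
)


def severity_code(name):
    """Return the RFC 5424 numeric code (0-7) for a severity name."""
    lowered = [s.lower() for s in SYSLOG_SEVERITIES]
    return lowered.index(name.strip().lower())


def is_logged(message_level, configured_level):
    """A message passes when it is at least as severe as the configured level."""
    return severity_code(message_level) <= severity_code(configured_level)


print(severity_code("Emergency"), severity_code("Debug"))
print(is_logged("Error", "Informational"))
```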
Management and Edge Node configuration
set logging-server <hostname-or-ip-address[:port]> proto <protocol-type> level <level>
# Example
set logging-server <hostname-or-ip-address>:514 proto udp level info
# Remove logging
del logging-server <hostname-or-ip-address>:514 proto udp level info
ESXi Configuration
esxcli network firewall ruleset set -r syslog -e true
esxcli system syslog config set --loghost=<hostname-or-ip-address[:port]>
esxcli system syslog reload
KVM Troubleshooting - syslog
Login as root
Create this file /etc/rsyslog.d/40-vmware-remote-logging.conf
Add this line to the file
*.* @<syslog_server_ip>:514;RFC5424fmt
Restart syslog
systemctl restart rsyslog
# Verify logging configuration
get logging-server
Monitoring Dashboards
Verify monitoring dashboards
Packet Capture
If you need detailed traffic info, use port mirroring.
Can use CLI to setup packet capture on:
1. NSX Manager
start capture interface <interface-name> [file <filename>] [count <packet-count>] [expression <expression>]
2. NSX Edges
set capture session <session-number> interface <port-uuid> direction <direction>
3. ESXi
Collect packets
pktcap-uw
4. View packets
tcpdump-uw
5. KVM
tcpdump
Troubleshooting scenarios
NSX Manager
If the file is corrupt, check the OVA or QCOW2 install files
Password requirement: 12 characters minimum
Check logs
get cluster status
get cluster config # Verify cluster configuration
get service <service-name>
start service <service-name>
Installation problems
NSX CLI Commands:
get services
get service
get cluster status
get configuration
nsxcli
Can see that ESXi is connected to 46, and KVM is on 47, showing the Shards are working correctly
Logical Switching
Common switching problems
N-VDS is incorrectly configured on a host
Overlay tunnel (GENEVE) is misconfigured
TEPs unable to reach each other
Validate switch
esxcfg-vswitch -l # verify switch configuration
nsxdp-cli # Verify NSX local datapath services and statistics
# Verify network interfaces
ifconfig
net-stat -I
Verification Process
# ssh to NSX manager node
get logical-switches # Verify all logical switches/segments configured in NSX manager
get logical-switch <segment-uuid> ports # verify the logical switch ports connected to the segment
get logical-switch <segment-VNI> transport-node-table # list the transport node table of the segment logical switch
get logical-switch <segment-VNI> arp-table
get logical-switch <segment-VNI> map-table
get logical-switch <segment-VNI> vtep
get nodes # list all the transport nodes
## ssh to ESXi host
nsxcli # enter nsxcli command mode
get logical-switches # It will list the switches VNI, UUID, DVS name, VIF numbers
get logical-switch <segment-VNI>
get logical-switch <segment-VNI> map-table
get logical-switch <segment-VNI> arp-table
get logical-switch <segment-VNI> vtep-table
## ssh to KVM host
sudo -i # enter root mode
virsh dumpxml <vm-name> | grep interfaceid # obtain the interfaceid of the required vm
nsxcli # enter nsxcli command mode
get logical-switches # It will list the switches VNI, UUID, DVS name, VIF numbers
get logical-switch <segment-VNI>
get logical-switch <segment-VNI> ports
get logical-switch <segment-VNI> map-table
get logical-switch <segment-VNI> arp-table
get logical-switch <segment-VNI> vtep
Check GENEVE VMKernel
esxcli network ip interface ipv4 get | grep vmk10
vmk10 is the TEP for NSX
esxcli network ip interface ipv4 get | grep vmk50
vmk50 is for intra-tier networking/routing and containers.
Verifying overlay tunnel reachability
Ping destination TEP interface from the source host
vmkping ++netstack=vxlan -s <packet-size> <destination-TEP-IP>
The netstack is named vxlan on the host rather than GENEVE. It's the same stack for ESXi.
Try 1572 if 1575 fails. This is the minimum size needed to support GENEVE. GENEVE adds 72 bytes to a 1500 byte data packet.
If 1572 fails, try 1472. If that works, the overhead for the overlay hasn’t been configured.
Example
vmkping ++netstack=vxlan -s 1572 -d <TEP-IP> # using 1572 data bytes, and ping destination TEP
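The 1472 and 1572 values map to path MTUs once the IPv4 and ICMP headers are added back, because `-s` sets only the ICMP payload size. A short sketch of that arithmetic, assuming an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def vmkping_payload(mtu):
    """Largest ICMP payload (the -s value) that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


for mtu in (1500, 1600):
    print(mtu, vmkping_payload(mtu))
```

So a successful 1572-byte ping with `-d` (do not fragment) indicates the underlay carries at least 1600-byte packets, while success only at 1472 indicates a plain 1500-byte MTU.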
N-VDS Not Initialised on a Host
If a VM is not able to communicate on a specific host, check that the segment is present on that host. If it isn’t showing on the host, go into the GUI and check that the N-VDS segment is present. If it is, check the virtual switches under advanced settings and look for errors such as Partial Success, or other information.
If this happens, check that the agents are running on the host.
/etc/init.d/nsx-mpa status
esxcli network ip connection list | grep 5671
/etc/init.d/nsx-proxy status
esxcli network ip connection list | grep 1235
/etc/init.d/nsx-opsagent status
# KVM host
service nsx-proxy status
netstat -nap | grep 1234
netstat -nap | grep 1235
Routing Problems
- BGP neighbours are misconfigured, so the neighbour relationship is not established.
- The internal route advertisement on the Tier-1 router is misconfigured.
- Route redistribution on the Tier-0 router is misconfigured.
Especially those check boxes!
Check Routing Table get logical-router
Check the SR for routing
Validate the routing table for the Tier-0
Check SR
Check VRF
get route b = BGP
DR (Distributed Routing)
For DR check the forwarder for similar information
get forwarding
BGP neighbour
get bgp neighbor summary
Check the status is established # Need status as "established"
where
Active means still setting up!
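The check above amounts to treating Established as the only healthy state. A minimal sketch using the session states defined by the BGP specification (RFC 4271); the function name is illustrative, and it takes a state string you have already read from the CLI output rather than parsing that output itself.

```python
# BGP finite-state-machine states from RFC 4271, in the order a session moves through them.
BGP_STATES = ("Idle", "Connect", "Active", "OpenSent", "OpenConfirm", "Established")


def bgp_session_up(state):
    """Only Established means routes are being exchanged with the neighbour."""
    normalized = state.strip().lower()
    if normalized not in {s.lower() for s in BGP_STATES}:
        raise ValueError(f"unknown BGP state: {state!r}")
    return normalized == "established"


print(bgp_session_up("Established"))
print(bgp_session_up("Active"))  # still trying to set up the TCP session
```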
BGP route table
Tier0 SR can show BGP route info
get bgp ipv4
Logical routers verification process
## ssh to NSX manager
get logical-routers
get logical-routers | find <router-name> # It will show distributed router (DR) or service router (SR)
## ssh to ESXi host
nsxcli # enter nsxcli command mode
get logical-routers
get logical-router <logical-router-uuid> # show logical router information, such as LIF number, state
get logical-router <logical-router-uuid> interfaces # show LIF uuid, overlay VNI, IP and netmask
# exit the nsxcli command mode, back to root user command prompt
exit # exit nsxcli command mode
net-vdr -I --brief -l # list distributed router UUID, LIFs, routes
# This command is equivalent in nsxcli "get logical-routers"
net-vdr -I --brief -l <router-uuid>
## ssh to KVM host
sudo -i
nsxcli # enter nsxcli command mode
get logical-routers
get logical-router <logical-router-uuid>
## ssh to NSX edge node
get logical-routers # list the logical routers
vrf <DR/SR-vrf-id>
get bgp neighbor summary # list router id, local AS, neighbor IP, remote AS, state
get route # list all routes in the routing table
get route connected # list the directly connected routes
get route bgp # list routes learned from bgp
get interfaces # display details of the logical router interfaces
get forwarding # display the logical router forwarding table, such as gateway IP, MAC
Firewall
Most common firewall issues are
Firewall policy rules are configured but not enabled or published
Firewall policy rules are not applied to the intended entity
The sequence of rules is incorrect, remember it’s top to bottom, left to right (categories)
1. KVM
get firewall status summary
ovs-appctl is used for configuration of the firewall.
Validate with
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/vif
Get the VIF then type
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/rules <VIF-id>
Rules are defined with addrsets (address sets). These have GUIDs on them as well.
2. ESXi
Use vsipioctl and summarize-dvfilter
summarize-dvfilter | grep <VM-name>
Look for the filter name, then use vsipioctl getrules -f <dvfilter-name>
The example adds the -A16 option,
which tells grep to print 16 lines after each match.
This is without the -A16 and with
Can also use the addrsets in the filter instead of name.
Same commands again but with -f addrset number.
The edges give definition of what’s in the rule sets
using get firewall ruleset rules
You get the interface_id by running get firewall interfaces
Distributed Firewall verification process
# ssh to ESXi host
summarize-dvfilter | grep -A<number-of-lines> <VM-name> # Retrieve the name of the dvfilter associated with the vNIC of the VM
vsipioctl getrules -f <dvfilter-name> # get the distributed firewall rule associated with a dvfilter
vsipioctl getaddrsets -f <dvfilter-name> # get IP and MAC address associated with the distributed firewall rule for a dvfilter
vsipioctl getfwconfig -f <dvfilter-name> # get the distributed firewall configuration
# this command provides the combined output of getrules and getaddrsets
# ssh to KVM host
sudo -i # enter root
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/vif
# get the virtual interface identifier for the vNICs that have associated distributed firewall rules on the KVM host
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/rules <VIF-id>
# get the distributed firewall rule associated with a virtual interface
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/addrset <addrset>
# get the IP and MAC address associated with the distributed firewall rule for a dvfilter
# ssh to Edge node
get firewall interfaces # get all edge interfaces that have firewall rule configured
get firewall <uuid> ruleset rules # get firewall rules associated with the uplink interface
nsxdp-cli
Can get deeper analysis with nsxdp-cli
Again, this command only shows 1 line as the -A1 is used in the egrep
Edge Validation
nsxcli
get configuration
get node-uuid
get interfaces
get managers
get host-switches
get tunnel-ports
get vtep
Verify network flow and packet
tcpdump
# KVM
Open vSwitch module must be installed on KVM hypervisor host before the host can be prepared and configured as a transport node
NSX-T CLI References
https://www.simongreaves.co.uk/nsx-t-cli-reference-guide/
NSX CLI (nsxcli) is the command line tool for troubleshooting NSX-T. It’s run in a non-root mode so you have to use the command structure available. For instance, there is no grep, but you can pipe output to find instead.
get logical-switches | find segments
You can use the nsxcli command line tool from various elements throughout the NSX-T deployment including NSX Manager, Edges and ESXi Transport Nodes.
From Edges
SSH Access to Edges, or nodes
Post deployment, open the console
get service ssh # verify the SSH service is stopped.
start service ssh # Start the SSH service.
set service ssh start-on-boot # Set the SSH service to autostart when the VM is powered on.
get service ssh # Verify that the SSH service is running and Start on boot is set to True.
set cli-timeout 0 # Disable the command-line timeout.
get logical-routers # Obtain routing information for the gateways.
# Verify that the SERVICE_ROUTER_TIER0 service gateway appears with an associated VRF ID.
vrf
Vrf is used to access the gateway virtual routing functions. For BGP and other Tier-0 services you access the Tier-0 service router (SR) function of the gateway.
vrf <vrf_ID> # Example: vrf 2
get bgp neighbor summary # verify the BGP state
# Important: Check the status is established. A status of Active means still setting up!
get bgp neighbor # View further information on the BGP connection. Also shows whether the connection is established or not.
get bgp ipv4 # View ipv4 bgp information.
Press q to quit out of BGP neighbor output.
exit # Exit the Tier-0 VRF service gateway mode.
Route Information from the Edges
get logical-routers
vrf <vrf_id>
get route # Shows the routes learned from the BGP peer
For DR check the forwarder for similar routing information.
get forwarding
Connect to the SR to collect that forwarding information, also get it from the DR. This is useful for seeing routing configuration for DR components throughout the environment, such as those on a Tier-1 DR.
DHCP
Runs on Edges.
get dhcp servers
get dhcp ip-pools
get dhcp leases
Load Balancer
# Load balancer is commonly deployed in an inline topology or a one-arm topology
get load-balancer
# The output shows the general load balancer configuration, including UUID and Virtual Server ID.
get load-balancer UUID virtual-server <Virtual_Server_ID>
# Copy the UUID and the Virtual Server ID values and paste them. Verify the virtual server configuration.
get load-balancer UUID pools
# Verify the server pool configuration, UUID is the value that you recorded for the load balancer.
Load balancer verification
# ssh to NSX edge node
get load-balancer # verify load balancer configuration
get load-balancer <uuid> virtual-server # verify virtual server configuration
get load-balancer <uuid> virtual-server <vs-uuid> status
get load-balancer <uuid> virtual-server <vs-uuid> stats
get load-balancer <uuid> pools
get load-balancer <uuid> error-log # get the error log
VPN Connectivity Tests on Edges
get ipsecvpn session active
# Verify that the L2VPN session is active, identify the peers, and ensure that the tunnel status is up.
get ipsecvpn session status
# Verify that the sessions are up.
get ipsecvpn session summary
# Check whether the ipsecvpn session is up between the local and remote peers.
get ipsecvpn ikesa <session-id>
get ipsecvpn tunnel stats
get l2vpn service config
get l2vpn sessions
# Get the l2vpn session, tunnel, and IPSEC session numbers, and check that the status is UP.
get l2vpn session stats
# Get statistical information of the local and remote peers, whether the status is UP, count of packets received, bytes received (RX), packets transmitted (TX), and packets dropped, malformed, or loops.
get l2vpn session config
# Get the session configuration information.
Layer 2 VPN verification process
# ssh to NSX Edge node
get ipsecvpn session summary # list all IPsec VPN sessions
get ipsecvpn session sessionid <session-id> # list session information, such as tunnel
get l2vpn sessions # list Layer2 VPN sessions
get l2vpn sessions config # show the L2 VPN sessions configuration
get l2vpn session <uuid> logical-switches # show L2 VPN session's logical switches
NSX Manager
# To change the password of an account run:
set user <username> [password <password> [old-password <old-password>]]
get certificate api thumbprint # Obtain NSX manager certificate thumbprint
Authentication Policy Settings for Local Users
Use the following to set:
1. Password length
set auth-policy minimum-password-length <password-length>
2. UI and API authentication policies. The UI and API local users have the same policy.
set auth-policy api lockout-period <lockout-period>
set auth-policy api lockout-reset-period <lockout-reset-period>
set auth-policy api max-auth-failures <auth-failures>
3. Set CLI authentication policy
set auth-policy cli lockout-period <lockout-period>
View Logs
# NSX CLI
get log-file policy.log
# Engineering Mode
Use st en to enter engineering mode (root privileged mode)
Syslog - Manager and Edges
set logging-server <hostname-or-ip-address[:port]> proto <protocol> level <level>
Transport Nodes
Logical Switches
#* List all logical switches; it shows VNI, UUID, Name, Type
get logical-switches
#* List all transport nodes associated with a logical switch.
get logical-switch <switch_UUID> transport-node-table
#* List all TEPs associated with a logical switch.
get logical-switch <switch_UUID> vtep
#* List MAC table associated with a logical switch.
get logical-switch <switch_UUID> mac-table
#* List ARP table associated with a logical switch.
get logical-switch <switch_UUID> arp-table
#*** On ESXi host to retrieve the VTEP, MAC and ARP entries
get logical-switch <VNI-uuid> vtep-table
get logical-switch <VNI-uuid> mac-table
get logical-switch <VNI-uuid> arp-table
Deploy Manager Cluster
To deploy a manager to an existing cluster, get the cluster configuration ID, make a note of the existing Manager's certificate thumbprint, and use both to join the new node to the cluster. Finally, get the cluster status to confirm the new node has joined.
get cluster config
join <NSX-Manager-IP> cluster-id <cluster-id> username <NSX-Manager-username> password <NSX-Manager-password> thumbprint <NSX-Manager1's-thumbprint>
get cluster status
OR via API
POST https://<nsx-mgr>/api/v1/cluster?action=join_cluster
ESXi Configuration
Several esxcli commands can be used to aid in NSX-T configuration.
esxcli network firewall ruleset set -r syslog -e true
esxcli system syslog config set --loghost=<hostname-or-ip-address[:port]>
esxcli system syslog reload
Can also use the nsxcli command set such as:
get logical-switches # It shows Overlay Kernel Entry - VNI, DVS Name, VIF number
# Also Overlay LCP Entry - VNI, Logical Switch UUID
get logical-switch <UUID> # View detail of specific logical switch
N-VDS and Tunnel Information
#***** ESXi Host
# Verify the kernel modules installed
esxcli system module list | egrep -i 'vswitch|vdl2|ens'
# Verify TEP
esxcli network ip interface ipv4 address list
esxcfg-vswitch -l # Verify switches and N-VDS configuration
esxcfg-vmknic -l # Verify TEP interfaces configuration
esxcfg-nics -l # verify the status of the uplinks
esxcli network nic up -n vmnic3 # bring up vmnic3
# Show details of N-VDS configuration
net-dvs # very detailed information
# Show logical switch and N-VDS info
# Include - summary, Overlay Kernel Entry (VNI, DVS Name, VIF num), Overlay LCP Entry (VNI, UUID)
get logical-switch
# Display the status of the overlay tunnels. Note: N-VDSs used to be called host switches, hence the get host-switch command.
get host-switch <N-VDS_NAME> tunnel
#****** KVM Host
ovs-vsctl list open_vswitch # display the Open vSwitch configuration
ovs-vsctl show # list the NSX bridges installed on the KVM hosts
KVM Configuration
Syslog
Login as root
Create this file
/etc/rsyslog.d/40-vmware-remote-logging.conf
Add this line to the file
*.* @<syslog_server_ip>:514;RFC5424fmt
Restart syslog
systemctl restart rsyslog
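When several KVM hosts need the same forwarding rule, the line can be generated rather than typed. A small sketch, assuming standard rsyslog selector syntax where a single `@` forwards over UDP and `@@` over TCP; the function name and the example address are hypothetical, and the `RFC5424fmt` template is taken from the line shown in this section and is assumed to be defined in the host's rsyslog configuration.

```python
def rsyslog_forward_rule(server, port=514, protocol="udp", template="RFC5424fmt"):
    """Return an rsyslog rule that forwards every facility and priority to server."""
    prefix = {"udp": "@", "tcp": "@@"}[protocol.lower()]
    return f"*.* {prefix}{server}:{port};{template}"


# 192.0.2.10 is a documentation address standing in for the real syslog server.
print(rsyslog_forward_rule("192.0.2.10"))
print(rsyslog_forward_rule("192.0.2.10", protocol="tcp"))
```

The returned string is what would go into /etc/rsyslog.d/40-vmware-remote-logging.conf before restarting rsyslog.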
nsxcli can also be used as outlined above. ESXi gives a bit more info as the kernel info is available in ESXi that isn’t there for KVM.
Verify the VM running state
1. ssh to KVM host
2. Verify VM running state
sudo virsh list --all # list all VMs and their running state
3. Power on VM
sudo virsh start <VM-name>
Packet Capture
Can use CLI to setup network packet capture on:
1. NSX Manager
start capture interface <interface-name> [file <filename>] [count <packet-count>] [expression <expression>]
2. NSX Edges
set capture session <session-number> interface <port-uuid> direction <direction>
Example.
set capture session 1 interface fp-eth1 direction in
set capture session 1 expression src net 172.20.10.0/24
3. Remove capture session information with:
del capture session 1
ESXi
# Collect packets. Can send to a file.
pktcap-uw # pktcap-uw --help
# Can Pipe it to view captured packets on the screen.
pktcap-uw <capture-options> -o - | tcpdump-uw -enr -
# To review captured packets on the ESXi host
tcpdump-uw
KVM
tcpdump
Verify installation problems
get services
get service <service name>
get cluster status
get configuration
get managers
Cluster Configuration Validation
NSX Manager nodes (Logs)
| get cluster status (nsxcli) | get services (nsxcli) | get log-file (nsxcli) | Login as root (Linux) |
|---|---|---|---|
| DATASTORE | datastore | - | /var/log/corfu/corfu.9000.log |
| CLUSTER_BOOT_MANAGER | cluster_manager | - | /var/log/cbm/cbm.log |
| CONTROLLER | controller | - | /var/log/cloudnet/nsx-ccp.log |
| MANAGER | manager | manager.log | /var/log/proton/nsxapi.log |
| POLICY | policy | policy.log | /var/log/policy/policy.log |
| HTTP | http | http.log | /var/log/proxy/reverse-proxy.log |
| - | - | syslog | /var/log/syslog |
Example to verify cluster service and start the service
get service http
start service http
Process to detach a failed NSX manager node
get cluster status # Verify the status as DEGRADED, and note down the failed node UUID
detach node <failed-node-uuid>
get cluster status # Ensure the cluster status as STABLE
get certificate api thumbprint # retrieve the NSX Manager thumbprint
get cluster config # retrieve the cluster ID
# Then ssh to the new NSX Manager node, and join it to the NSX Manager cluster
join <NSX-node-IP> cluster-id <cluster-id> thumbprint <NSX-thumbprint> username <admin> password <password>
get cluster status # verify cluster status as STABLE
Transport Node Preparation
Verify transport node preparation
# ESXi host
esxcli software vib list | egrep "nsx|vsip" # verify NSX-T data center packages installed on the esxi host
esxcli system module list | grep nsx # verify the kernel module installation
esxcli network ip interface ipv4 address list # list the VMkernel IPv4 address list
esxcli network ip netstat list # list the TCP/IP stacks available on the transport node
esxcfg-vswitch -l # list the vSwitch available on the ESXi transport node
/etc/init.d/nsx-proxy status # Verify the NSX-Proxy agent service running status
esxcli network ip connection list | egrep "1234|1235" # Verify the connections are established
# KVM host
dpkg --list | grep nsx
ifconfig # verify IPv4 address
ovs-vsctl show # Verify Open vSwitch configuration
service nsx-proxy status # Verify NSX proxy running status
netstat -nap | grep 1234
netstat -nap | grep 1235
vSphere
Verify ESXi Hosts
# Check all VIBs / software installed on vSphere
esxcli software vib list | grep -e nsx -e vsip
esxcfg-module -l | grep nsx # This may work
# Check ESXi agents status
/etc/init.d/nsx-proxy status
esxcli network ip connection list | grep 1234
esxcli network ip connection list | grep 1235
KVM
# Ubuntu
dpkg --list | grep nsx
# Redhat
rpm -qa | grep nsx
Check TEP and Hyperbus
Hyperbus is for containers.
# verify TEP
esxcli network ip interface ipv4 address list
# vmk10 TEP interface
esxcli network ip netstack list # verify the TCP/IP stacks used by the TEP and hyperbus interfaces
# TEP interface uses vxlan
esxcfg-vswitch -l # verify N-VDS configuration
vSphere
# Check ip v4 address list
esxcli network ip interface ipv4 address list
# Output - Overlay TEP (vmk10, the default vmk) and Hyperbus (vmk50, the default vmk).
# Verify TCP/IP stacks for TEP and hyperbus.
esxcli network ip netstack list
# Vxlan is GENEVE in ESXi.
KVM
# Check network interface and address
ifconfig
# verify ethx, and nsx-vtep0.0
# Check open vswitch
ovs-vsctl list Open_vSwitch
# It shows UUID of bridges, bridges, datapath_types, ovs_version (Open vSwitch version), system_type (Host OS type and version)
ovs-vsctl show
# It shows information, such as "Bridge nsx-managed", is_connected
# Port hyperbus: Interface hyperbus # The hyperbus interface is connected to the nsx-managed bridge and manages the containers
# Bridge "nsx-switch.0" # The first bridge
# Port "nsx-vtep0.0"
# Interface "nsx-vtep0.0" # The nsx-vtep 0.0 interface is connected to the nsx-switch.0 bridge,
# and is responsible for encapsulating and decapsulating the overlay traffic
dpkg --list | grep nsx # list the installed nsx modules
service nsx-proxy status # verify nsx agent status
Agents and Connectivity
ESXi
/etc/init.d/nsx-mpa status
esxcli network ip connection list | grep 5671
/etc/init.d/nsx-proxy status
esxcli network ip connection list | grep 1235
/etc/init.d/nsx-opsagent status
KVM
service nsx-mpa status
netstat -nap | grep 5671
service nsx-proxy status
netstat -nap | grep 1235
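The netstat and esxcli checks above show established connections from the host's side. As a complementary check, a TCP connect test against the same ports can confirm that nothing on the path is blocking them. A minimal sketch using only the standard library; `tcp_port_open` is a hypothetical helper, and which NSX component listens on which port is taken from the commands in this section, not verified by the code.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `tcp_port_open("<nsx-manager-ip>", 1235)` run from a transport node would test the port that nsx-proxy is expected to use.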
Other commands
ESXi
type the esxcli command.
esxcli network ip connection list | grep 1235
KVM
netstat -anp --tcp | grep 1235
Verify NSX Edge installation and configuration
get configuration # verify configuration
get managers # list NSX manager nodes
get node-uuid
get interfaces
get host-switches
get tunnel-ports
get vteps
Checking Communication from Host to Controller and Manager
1. ESXi - Verify output for status
# On an ESXi host using NSX-T CLI commands:
get managers
get controllers
2. KVM
# On a KVM host using NSX-T CLI commands:
get managers
get controllers
View details of N-VDS, the N-VDS has its own command line tool, net-vdl2.
# View detailed information, such as NSX VDS name, VDS ID, VTEP interface, logical network, Controller IP and status
net-vdl2 -l # https://kb.vmware.com/s/article/66796
ESXi LIF MAC View The LIF (Logical Interface) vMAC to pMAC on ESXi host.
# Detail information, such as DvsName, NumLifs, DRvMAC (vMAC), pMAC, uplink, team or non-team member
net-vdr -C -l
# Note: 02:50:56:56:44:52 is always the vMAC for the LIFs for all DRs.
# To view the DR instance information on an ESXi host
net-vdr -l -I # -l "l" for list
# -I "I" for instance
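Because the DR vMAC is a fixed value, it can be useful to spot it in MAC tables or packet captures where the formatting differs. A short sketch that normalises common MAC notations before comparing; the helper names are illustrative.

```python
DR_VMAC = "02:50:56:56:44:52"  # vMAC quoted in the note above


def normalize_mac(mac):
    """Convert aa-bb-.., aabb.ccdd.eeff or aa:bb:.. notation to lowercase colon form."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def is_dr_vmac(mac):
    """True when the address is the shared distributed-router vMAC."""
    return normalize_mac(mac) == DR_VMAC


print(is_dr_vmac("02-50-56-56-44-52"))
print(is_dr_vmac("00:50:56:aa:bb:cc"))
```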
Verify Infrastructure Communication Events
Infrastructure communication events arise from the NSX Edge, KVM, ESXi, and public gateway nodes.
# ssh to Edge node, verify status
nsxcli get tunnel-ports
# On each tunnel, check the stats for any drops
get tunnel-port <port-uuid> stats
# Verify service
get services
start service <service-name> # if service stops
Verify node agent health
### For ESX
# Verify vmk50, if missing, recreate it
# if Hyperbus 4094 is missing,
restart nsx-cfgagent
# if nsx-cfgagent has stopped
restart nsx-cfgagent
### For KVM
# if Hyperbus namespace is missing
restart nsx-cfgagent
# if nsx-agent has stopped
restart nsx-agent
DFW Validation
Distributed Firewall Configuration Validation
# Distributed firewall policies are divided into five default categories
Ethernet -> Emergency -> Infrastructure -> Environment -> Application
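When a rule does not seem to take effect, it helps to list rules in the order they are evaluated: by category from left to right, then top to bottom within a category. A minimal sketch of that ordering; the rule tuples and names are made-up sample data, and the model deliberately ignores details such as policies within a category.

```python
DFW_CATEGORIES = ["Ethernet", "Emergency", "Infrastructure", "Environment", "Application"]


def evaluation_order(rules):
    """Sort (category, position, name) tuples into category order, then top to bottom."""
    rank = {category: index for index, category in enumerate(DFW_CATEGORIES)}
    return sorted(rules, key=lambda rule: (rank[rule[0]], rule[1]))


# Hypothetical rules: (category, position within the category, rule name)
rules = [
    ("Application", 1, "allow-web"),
    ("Emergency", 2, "quarantine"),
    ("Infrastructure", 1, "allow-dns"),
    ("Emergency", 1, "block-bad-host"),
]
for category, position, name in evaluation_order(rules):
    print(category, position, name)
```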
NSX Manager
get firewall status
get firewall published-entities # List the rules published to CCP
ESXi
nsxt-vsip is the DFW module.
# Run commands to verify status
/etc/init.d/nsx-mpa status # NSX-Management-Plane-Agent
/etc/init.d/nsx-proxy status # nsx-proxy agent service running status
/etc/init.d/nsx-opsagent status # opsAgent running status
esxcli system module list | grep nsxt-vsip # Verify nsxt-vsip running status
esxcli network ip connection list | grep 5671 # Verify tcp 5671 for mpa established status
esxcli network ip connection list | grep 1235 # Verify nsx-proxy established status
Time based rules require NTP service
/etc/init.d/ntpd status # Check NTP service
ntpq -p # Verify NTP associations on the ESXi host
Check dvFilter for firewall rules.
summarize-dvfilter | grep <VM_NAME>
summarize-dvfilter | grep -A <number-of-lines> <VM_Name> # Run the summarize-dvfilter command to list the port number and dvFilter name of a VM
vsipioctl getrules -f <dvfilter-name> # Verify the firewall rules applied to the vNIC of the virtual machine
KVM
KVM Host firewall configuration verification
# Use these to validate the distributed firewall Settings. View app firewall virtual interfaces.
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/vif
# View firewall rules with containing addrsets.
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/rules <VIF_ID_NUMBER>
Gateway Firewall Validation
Gateway Firewall predefined categories
Emergency -> System -> Pre Rules -> Local Gateway -> Auto Service Rules -> Default
Where
Pre Rules # applied globally across all NSX gateway nodes
Useful firewall operation and troubleshooting commands
## On ESXi host
vsipioctl -h # help menu
# Get the list of VMs on the ESXi host and associated filter name
summarize-dvfilter | grep -A 3 vmm
# Get the firewall rules applied to a VM
vsipioctl getrules -f <filter-name>
# Example: vsipioctl getrules -f nic-7014985-eth0-vmware-sfw.2
# Get stats per FW rule per VM VNIC
# Use "-s" with the above command to get the firewall stats associated with the VM firewall rules.
vsipioctl getrules -f nic-7014985-eth0-vmware-sfw.2 -s
# Get the addrset/groups used in the VM's Firewall rules
# The firewall rule uses groups/addrsets in the source or destination. This output gets all the addrsets used in the rules based on the grouping configuration
vsipioctl getaddrsets -f nic-1371516-eth0-vmware-sfw.2
# Get the active Firewall flow per VM
# NSX DFW maintains active flows per VNIC. This output gets all the active flows over that VNIC.
vsipioctl getflows -f nic-7014985-eth0-vmware-sfw.2
# Get the active full firewall config per VM
# This output provides full firewall config per VNIC- Rules, Addrset & Profiles used.
vsipioctl getfwconfig -f nic-7014985-eth0-vmware-sfw.2
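The filter names used in the examples above follow a visible pattern, so they can be validated or split in a script before being passed to vsipioctl. A sketch based only on the names shown in this section; the field meanings (a per-VM numeric ID, the vNIC index, and a filter slot) are assumptions, not taken from VMware documentation.

```python
import re

# Pattern inferred from names such as nic-7014985-eth0-vmware-sfw.2
DVFILTER_NAME = re.compile(r"^nic-(\d+)-eth(\d+)-vmware-sfw\.(\d+)$")


def parse_dvfilter_name(name):
    """Split a DFW dvfilter name into (vm_id, vnic_index, slot); field names are assumed."""
    match = DVFILTER_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unexpected dvfilter name: {name!r}")
    vm_id, vnic_index, slot = match.groups()
    return vm_id, int(vnic_index), int(slot)


print(parse_dvfilter_name("nic-7014985-eth0-vmware-sfw.2"))
```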
### NSX CLI for firewall troubleshooting
nsxcli # enter nsxcli command mode
get firewall <enter> # list all get firewall command options
get firewall <interface_uuid> interface stats | json # check flow table usage
get firewall packetlog last 10 # get the last 10 packetlog entries
get firewall exclusion
get firewall thresholds
### Troubleshooting Distributed Firewall on KVM Hosts
get firewall vifs # list all VIFs
get firewall <vif-uuid> ruleset rules # Discover firewall rules that apply to a specific VIF
get firewall <vif-uuid> addrsets # Get the list of address sets used in a specific VIF
get firewall <vif-uuid> profile # Get the list of APPIDs and FQDNs used in a specific VIF
get firewall <vif-uuid> fqdn # Discover FQDN of specific VIF
ovs-appctl dpctl/dump-conntrack -m | grep <source-ip>| grep <dest-ip> # look for flows between two specific IP addresses
### Troubleshooting Gateway Firewall
# Gateway firewall is implemented on NSX Edge transport node
get logical-router # Get UUID of the Gateway on which Firewall is enabled
# Get all Gateway interfaces using UUID
# Gateway firewall is implemented per Uplink interface of a Gateway. Identify the uplink interface and get the interface ID from the output below.
get logical-router <gateway-uuid/SR-uuid/DR-uuid> interfaces
get firewall <GW-interface-uuid> ruleset rules # Get Gateway Firewall Rules on a GW Interface
# Check Gateway Firewall Sync status
# Gateway Firewall sync flow status between Edge Nodes for high availability. Gateway firewall sync config can be seen using the output below
get firewall <GW-interface-uuid> sync config
# Check Gateway Firewall Active Flows
get firewall <GW-interface-uuid> connection
# Check Gateway Firewall Logs
# Gateway firewall logs provide the gateway VRF and GW Interface information, along with flow details.
# Gateway firewall logs can be accessed on the edge, or can be sent to Syslog Server.
# Firewall logs provide the logical router VRF, firewall interface ID, FW rule ID & flow details.
get log-file syslog | find datapathd.firewallpkt
# Other Command Line Options for debugging Gateway Firewall
get firewall <GW-interface-uuid> <ENTER>
#### Distributed Firewall Packet Logs
/var/log/dfwpktlogs.log # The log file is for both ESXi and KVM hosts
Post Upgrade
Post upgrades, use these CLI commands to validate that NSX-T has been upgraded successfully and the correct NSX-T modules are installed.
vSphere
esxcli software vib list | grep nsx
KVM
# Ubuntu
dpkg -l | egrep 'nsx|openvswitch'
# Red Hat
rpm -qa | egrep 'nsx|openvswitch'
# nsxcli
get version
Simple Network Management Protocol (SNMP)
You can use Simple Network Management Protocol (SNMP) to monitor your NSX-T Data Center components. The SNMP service is not started by default after installation.
Download and install the file VMware-NSX-MIB.mib
### SNMP configuration - NSX manager CLI or NSX Edge CLI
# For SNMPv1/SNMPv2
set snmp community <community-string>
start service snmp
# For SNMPv3
set snmp v3-user <username> auth-password <auth-password> priv-password <priv-password>
start service snmp
Port Mirroring in Manager Mode
Note that logical SPAN is supported for overlay logical switches only and not VLAN logical switches.
For a local SPAN session, the mirror session source and destination ports must be on the same host vSwitch.
## Port mirroring session
# select session type, the available types are
Local SPAN # Select transport node
Remote SPAN # Session type - RSPAN source session, or RSPAN destination session
# select transport node
# Encapsulation VLAN ID
Remote L3 SPAN # Encapsulation - select GRE, ERSPAN TWO, or ERSPAN THREE
Logical SPAN # Logical switch (overlay logical switch only, NOT VLAN logical switch)
NSX-T Command Line Cheat Sheet
https://www.simongreaves.co.uk/nsx-t-command-line-cheat-sheet/